docs: add scripts/uq-publish.py for understand-quickly registry integration #332

Open
amacsmith wants to merge 2 commits into DeusData:main from amacsmith:feat/uq-publish

@amacsmith

Why

looptech-ai/understand-quickly is a public registry of code-knowledge graphs that ships its own MCP server and a stable registry.json API. codebase-memory-mcp is itself an MCP server backed by a persistent code-knowledge graph — the registry is the natural place for those graphs to be discovered by other agents (Claude, Codex, Cursor) without each user pointing at a private endpoint.

This PR ships the integration as a small Python helper (scripts/uq-publish.py) that drives the existing MCP surface — no C changes, no new dependencies in the binary, no rebuild needed. The "single static binary, zero dependencies" property is preserved.

What changes

  • A new scripts/uq-publish.py (stdlib-only — urllib, subprocess, json).
  • A unit-test counterpart scripts/test_uq_publish.py using stdlib unittest.
  • A "Publish to understand-quickly" section in the README near the existing "Team-Shared Graph Artifact" section.

When run, the script:

  1. Spawns the codebase-memory-mcp binary, sends initialize + notifications/initialized + tools/call get_architecture over stdio (the same pattern used by the existing scripts/test_mcp_rapid_init.py).
  2. Projects the response onto a gitnexus@1-shaped JSON graph ({schema, nodes, links}).
  3. Stamps metadata.{tool, tool_version, generated_at, commit} (commit via git rev-parse HEAD).
  4. Writes .codebase-memory/graph.json.
  5. If $UNDERSTAND_QUICKLY_TOKEN is set, fires a repository_dispatch event at looptech-ai/understand-quickly so the registry's sync workflow re-pulls the entry. If unset, only the local file is written — no network call.
python3 scripts/uq-publish.py
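
The stdio exchange in step 1 can be sketched as three newline-delimited JSON-RPC messages. This is a minimal illustration, not the script's actual code: the helper name `build_handshake`, the `protocolVersion` string, and the `clientInfo` values are assumptions. Only the three method names and the `get_architecture` tool name come from the description above.

```python
import json


def build_handshake(tool, arguments):
    """Return the newline-delimited JSON-RPC payload for a single tool call."""
    messages = [
        # Request 1: open the MCP session. The version string and clientInfo
        # values are assumed for illustration.
        {"jsonrpc": "2.0", "id": 1, "method": "initialize",
         "params": {"protocolVersion": "2024-11-05",
                    "capabilities": {},
                    "clientInfo": {"name": "uq-publish", "version": "0"}}},
        # Notification (no id): tell the server initialization is complete.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        # Request 2: the tool call whose response the caller waits for.
        {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
         "params": {"name": tool, "arguments": arguments}},
    ]
    return "".join(json.dumps(m) + "\n" for m in messages)


payload = build_handshake("get_architecture", {"project": "demo"})
```

The payload would then be written to the binary's stdin (for example through `subprocess.run(..., input=payload, text=True)`), and the reply carrying `id` 2 read back from stdout.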

The recommended CI step is the looptech-ai/uq-publish-action@v0.1.0 Marketplace Action (which collapses the dispatch step to ~5 lines of YAML).

Diff stats

 README.md                  |  20 +++++
 scripts/test_uq_publish.py |  65 +++++++++++++++
 scripts/uq-publish.py      | 201 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 286 insertions(+)

The publish script is 201 LoC; the rest is tests + docs.

No-op default

The script is entirely opt-in (no auto-invoke, no hook). Running it without UNDERSTAND_QUICKLY_TOKEN writes only the local file. Running it with the token but the repo not yet registered prints a one-line registration hint and exits 0. Failures (network, missing remote, missing binary) never raise — they print a single line and return cleanly so a CI job using this script in a non-publish context isn't broken.
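
The token gate and the "never raise" behavior described above could look roughly like the sketch below. The function names, the `event_type` value, and the `client_payload` key are hypothetical; the endpoint is GitHub's standard `POST /repos/{owner}/{repo}/dispatches` route, and the real script may structure this differently.

```python
import json
import os
import urllib.error
import urllib.request

REGISTRY = "looptech-ai/understand-quickly"


def build_dispatch_request(token, slug, event_type="sync-entry"):
    """Build (but do not send) a repository_dispatch request.

    The event_type default and the payload key are assumed values.
    """
    body = json.dumps({"event_type": event_type,
                       "client_payload": {"id": slug}}).encode()
    return urllib.request.Request(
        f"https://api.github.com/repos/{REGISTRY}/dispatches",
        data=body, method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/vnd.github+json"})


def maybe_dispatch(slug, env=None, opener=urllib.request.urlopen):
    """Fire the dispatch only when the token is set; always return 0."""
    env = os.environ if env is None else env
    token = env.get("UNDERSTAND_QUICKLY_TOKEN")
    if not token:
        print("[uq-publish] token unset; skipping dispatch")
        return 0
    try:
        with opener(build_dispatch_request(token, slug), timeout=10):
            pass
    except (urllib.error.URLError, OSError) as exc:
        # Network failures are reported on one line and swallowed.
        print(f"[uq-publish] dispatch failed: {exc}")
    return 0
```

Passing `env` and `opener` as parameters is a design choice for this sketch only: it lets both the no-token path and the failure path be exercised without touching the network.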

Schema fit

The script writes gitnexus@1 ({nodes, links} arrays plus metadata) — the closest existing schema in the registry. If you'd later prefer a richer cbm@1 schema with shape-specific data (LSP-resolved type info, HTTP route nodes, ADR nodes), the registry has a format-authoring path — happy to co-author it as a follow-up.
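
For orientation, a document of the shape described here might be assembled as follows. This is a sketch under assumptions: the helper name `make_graph`, the `tool` value, and the ISO-8601 timestamp format are illustrative, and the node and link field names follow the `id`/`label`/`kind` and `source`/`target`/`kind` keys used in the script's `build_graph`. The registry's format spec is the authority on required fields.

```python
import json
from datetime import datetime, timezone


def make_graph(nodes, links, tool_version, commit):
    """Wrap node and link lists in a gitnexus@1-shaped envelope."""
    return {
        "schema": "gitnexus@1",
        "nodes": nodes,
        "links": links,
        "metadata": {
            "tool": "codebase-memory-mcp",  # assumed tool identifier
            "tool_version": tool_version,
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "commit": commit,  # e.g. the output of `git rev-parse HEAD`
        },
    }


graph = make_graph(
    nodes=[{"id": "1", "label": "AuthService", "kind": "module"}],
    links=[{"source": "1", "target": "2", "kind": "depends_on"}],
    tool_version="0.0.0",
    commit=None,
)
text = json.dumps(graph, indent=2)
```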

Test plan

  • python3 scripts/test_uq_publish.py — 2 tests pass.
  • No changes to C source, no new build artifacts, no impact on the static binary.
  • (Maintainer) python3 scripts/uq-publish.py /path/to/cbm against a real-indexed project writes .codebase-memory/graph.json with stamped metadata.
  • (Maintainer) With UNDERSTAND_QUICKLY_TOKEN set and the repo registered, dispatch fires and the registry's sync.yml runs within ~1 minute.
  • (Maintainer) With the token unset, script prints "skipping dispatch" and returns 0.

Notes

  • This is the lightest-touch integration possible — purely additive, opt-in, and keeps the binary's zero-deps invariant.
  • If you'd prefer the --publish flag baked directly into cbm (a C-side implementation), I'm happy to follow up with a separate PR; the script can serve as the reference implementation.
  • Once a few users land in the registry, we can add DeusData/codebase-memory-mcp to the verified-publisher allowlist for auto-merge of registry updates.
  • Licensing for users. Submitting via the publish path is governed by the Understand-Quickly Data License 1.0. It is opt-in, gated on the user setting UNDERSTAND_QUICKLY_TOKEN.

Links

…ration

Adds a small Python helper (stdlib-only — no new dependencies, no C
changes, no impact on the binary) that:

  1. Drives the existing cbm binary over stdio MCP to extract graph
     contents via `get_architecture`.
  2. Projects to a `gitnexus@1`-shaped JSON file at
     `.codebase-memory/graph.json` with `metadata.{tool, tool_version,
     generated_at, commit}`.
  3. If `UNDERSTAND_QUICKLY_TOKEN` is set, fires a
     `repository_dispatch` at `looptech-ai/understand-quickly` so the
     registry resyncs the entry.

This preserves the "single static binary, zero dependencies" property —
the script is opt-in and lives under `scripts/`, alongside other Python
helpers like `scripts/test_mcp_rapid_init.py`. Two unit tests with
stdlib `unittest`.

Spec: https://github.com/looptech-ai/understand-quickly/blob/main/docs/spec/code-graph-protocol.md
Action: https://github.com/looptech-ai/uq-publish-action
Copilot AI review requested due to automatic review settings May 10, 2026 07:52

Copilot AI left a comment

Pull request overview

Adds an opt-in, stdlib-only publishing helper to project codebase-memory-mcp graphs into an understand-quickly-compatible artifact (.codebase-memory/graph.json) and optionally trigger a registry resync via GitHub repository_dispatch.

Changes:

  • Added scripts/uq-publish.py to extract graph info via MCP over stdio, write gitnexus@1 JSON, and optionally dispatch a sync event.
  • Added scripts/test_uq_publish.py (stdlib unittest) for basic coverage of graph projection + metadata stamping.
  • Updated README.md with a “Publish to understand-quickly (opt-in)” section and recommended CI action snippet.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File | Description
README.md | Documents the opt-in publish workflow, including local-only behavior without a token and the suggested CI action.
scripts/uq-publish.py | Implements MCP stdio invocation, graph JSON emission, metadata stamping, and optional GitHub dispatch.
scripts/test_uq_publish.py | Adds unit tests for metadata stamping and graph projection logic.

Comment thread scripts/uq-publish.py Outdated
Comment on lines +74 to +81
    for line in proc.stdout.splitlines():
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if obj.get("id") == 2:
            return obj.get("result", {})
    raise RuntimeError(f"no response from {binary} for tool {tool}")
Comment thread scripts/uq-publish.py Outdated
Comment on lines +84 to +99
def build_graph(binary: str, project: str) -> dict:
    """Project the in-memory KG to a `gitnexus@1`-shaped graph using existing MCP tools."""
    arch = _mcp_call(binary, "get_architecture", {"project": project})
    nodes_raw = arch.get("nodes") or arch.get("modules") or []
    edges_raw = arch.get("edges") or arch.get("links") or []
    nodes = [
        {"id": str(n.get("id", n.get("name", i))),
         "label": n.get("name", n.get("label", "")),
         "kind": n.get("kind", n.get("type", "module"))}
        for i, n in enumerate(nodes_raw)
    ]
    edges = [
        {"source": str(e.get("source", e.get("from", ""))),
         "target": str(e.get("target", e.get("to", ""))),
         "kind": e.get("kind", e.get("type", "depends_on"))}
        for e in edges_raw
Comment thread scripts/test_uq_publish.py Outdated
Comment on lines +51 to +61
        fake_arch = {
            "nodes": [{"id": 1, "name": "AuthService", "kind": "module"}],
            "edges": [{"source": 1, "target": 2, "kind": "depends_on"}],
        }
        with mock.patch.object(uq, "_mcp_call", return_value=fake_arch):
            graph = uq.build_graph("/fake/binary", "demo")
        self.assertEqual(graph["schema"], "gitnexus@1")
        self.assertEqual(len(graph["nodes"]), 1)
        self.assertEqual(graph["nodes"][0]["label"], "AuthService")
        self.assertEqual(len(graph["links"]), 1)
        self.assertEqual(graph["links"][0]["source"], "1")
Comment thread scripts/uq-publish.py Outdated
Comment on lines +40 to +56
def _git(args: list[str], cwd: Path) -> str | None:
    try:
        r = subprocess.run(  # nosec B603
            ["git", *args], cwd=str(cwd), capture_output=True, text=True,
            check=False, timeout=5,
        )
    except (FileNotFoundError, subprocess.SubprocessError):
        return None
    return r.stdout.strip() if r.returncode == 0 else None


def _detect_repo_slug(cwd: Path) -> str | None:
    url = _git(["remote", "get-url", "origin"], cwd) or ""
    for prefix in ("https://github.com/", "git@github.com:"):
        if url.startswith(prefix):
            slug = url[len(prefix):].removesuffix(".git")
            return slug or None
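
For comparison, the slug parsing above can be written without `str.removesuffix` or `X | None` annotations so that it also runs on older interpreters. This is only a sketch: `parse_repo_slug` is a hypothetical name, and it takes the remote URL directly rather than shelling out to git.

```python
from typing import Optional


def parse_repo_slug(url: str) -> Optional[str]:
    """Extract "owner/repo" from a GitHub remote URL, or return None."""
    for prefix in ("https://github.com/", "git@github.com:"):
        if url.startswith(prefix):
            slug = url[len(prefix):]
            # Manual suffix strip in place of str.removesuffix (3.9+).
            if slug.endswith(".git"):
                slug = slug[:-len(".git")]
            return slug or None
    # Non-GitHub or empty remotes yield no slug.
    return None


slug = parse_repo_slug("git@github.com:DeusData/codebase-memory-mcp.git")
```
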
Comment thread scripts/uq-publish.py
Comment on lines +154 to +160
    try:
        graph = build_graph(args.binary, args.project)
    except Exception as exc:
        print(f"[uq-publish] could not extract graph via {args.binary}: {exc}",
              file=sys.stderr)
        return 1

…edges

Address Copilot review on PR DeusData#332:

- _mcp_call now unwraps the MCP `{content:[{type:'text',text:'...'}]}`
  envelope and json.loads the inner text; raises RuntimeError on isError.
  Previously callers saw the raw envelope, so build_graph silently produced
  empty graphs.
- build_graph now uses two query_graph Cypher queries to fetch Module
  nodes and their dependency edges instead of get_architecture, which
  returns counts/summaries (total_nodes, node_labels, edge_types) — never
  a node/edge list.
- Replace Python 3.10+ `str | None` annotations with `Optional[str]` and
  drop `removesuffix` (3.9+) for broader Python compatibility.
- Tests now mock the new _query_rows interface and add explicit coverage
  for the MCP envelope unwrap (success + isError paths).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@amacsmith
Author

Thanks for the Copilot review — addressed the real correctness findings in 87491f2:

  • MCP envelope unwrap (scripts/uq-publish.py:60-100): _mcp_call now unwraps the {content:[{type:'text',text:'...'}]} shape and json.loads-es the inner text, raising on isError. Previously callers were treating the raw envelope as the decoded payload.
  • get_architecture -> query_graph (build_graph): switched to two Cypher queries against query_graph to actually fetch Module nodes + dependency edges, since get_architecture returns counts/summaries (total_nodes, node_labels, edge_types) rather than a node/edge list.
  • Python 3.9 compatibility: replaced str | None annotations with Optional[str] and dropped str.removesuffix so the helper runs on 3.9+.
  • Tests: now mock the _query_rows interface and added explicit coverage for the envelope unwrap (success + isError paths). All 4 tests pass locally.
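
The envelope unwrap from the first bullet can be illustrated as below. This is a minimal sketch, not the code in 87491f2: `unwrap_tool_result` is a hypothetical name, and it assumes the first `text` content item holds the JSON payload.

```python
import json


def unwrap_tool_result(result):
    """Decode the JSON text carried inside an MCP tools/call result."""
    if result.get("isError"):
        raise RuntimeError(f"tool returned an error: {result.get('content')}")
    for item in result.get("content", []):
        if item.get("type") == "text":
            # The payload is a JSON document serialized into the text field.
            return json.loads(item["text"])
    raise RuntimeError("no text content in tool result")


envelope = {"content": [{"type": "text", "text": '{"total_nodes": 3}'}]}
decoded = unwrap_tool_result(envelope)
```

Without this step a caller that looks up `nodes` or `edges` on the raw envelope finds neither key, which is how the earlier version ended up with empty graphs.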

Skipped style-only nits (e.g. exit-code wording in PR description). Happy to adjust further.
